Parameterising Jobscripts

Now that we know how to set a script directly, we can at least run jobs on the nodes.

However, this method is rather inflexible, and it would be nice to be able to parameterise these scripts in some way.

Templates

The simplest way of doing this is to use a template.

Assuming you have a jobscript already, you’re most of the way there.

Let’s take our script from earlier and create a template from it.

To do this, we just need to identify anything that we would want to change, and add a placeholder for a parameter.

Placeholders

These placeholders have a specific format: remotemanager creates a value entry for anything it finds of the form #VALUE#.

For example, for our nodes parameter, we should change the line

#SBATCH --nodes=4

to

#SBATCH --nodes=#NODES#
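To make the #VALUE# convention concrete, here is a minimal sketch of how such placeholders could be located with a regular expression. This is not remotemanager's own parser: the pattern and the find_placeholders helper are assumptions made purely for illustration.

```python
import re

# Hypothetical pattern: '#', a name, optional ':kwargs', then a closing '#'.
PLACEHOLDER = re.compile(r"#([A-Za-z_]\w*)(?::[^#\n]*)?#")


def find_placeholders(template: str) -> list[str]:
    """Return the lower-cased placeholder names, in order of first appearance."""
    names = []
    for match in PLACEHOLDER.finditer(template):
        name = match.group(1).lower()
        if name not in names:
            names.append(name)
    return names


line = "#SBATCH --nodes=#NODES#"
print(find_placeholders(line))  # ['nodes']
```

Note that the leading #SBATCH is not picked up, because it is followed by a space rather than a closing # character.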

Let’s do the whole script:

[2]:
jobscript_template = """#!/bin/bash

#SBATCH --ntasks-per-node=#TASKS_PER_NODE#
#SBATCH --cpus-per-task=#CPUS_PER_TASK#
#SBATCH --nodes=#NODES#
#SBATCH --queue=#QUEUE#
#SBATCH --account=#ACCOUNT#
#SBATCH --walltime=#TIME:format=time:default=3600#
#SBATCH --exclusive

#MODULES#"""

Now that we have a parameterised jobscript, how do we use it?

This template can be used by a Computer (or Script) class to generate a script based on your inputs.

Let’s go with Computer, since it’s the more commonly used one:

[3]:
from remotemanager import Computer

conn = Computer(template=jobscript_template)

print(conn.script(tasks_per_node=16, cpus_per_task=4, nodes=12, queue="standard", account="test"))
#!/bin/bash

#SBATCH --ntasks-per-node=16
#SBATCH --cpus-per-task=4
#SBATCH --nodes=12
#SBATCH --queue=standard
#SBATCH --account=test
#SBATCH --walltime=01:00:00
#SBATCH --exclusive

Note

Note that arguments are always lower case, even if you write them in upper case in the template.

Note

Don’t worry too much about this script function. This will be called for you when you’re using a Dataset; you don’t have to generate the scripts yourself.

Now that we have a script that uses parameters, we should cover the ways in which you can alter those params.

Generalisation

It’s important to note that these parameters can be anything you want.

As an example of this, we’re parameterising our module load with the #MODULES# parameter.

This means that we can dynamically change the modules by setting

conn.modules = "module load ..."

We will come back and use this template later in the tutorial. For now, we should cover the various ways you can control how these values behave.

Keyword Arguments

If we take a closer look at the walltime line, notice the :format=time:default=3600 string. These are keyword arguments (kwargs).

In this case, we are specifying that the output should be formatted as though it is a time string, with a default of 3600 (seconds).
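As a rough illustration of this syntax, the sketch below splits the inside of a placeholder into a name and its keyword arguments. The parse_placeholder helper is hypothetical and much simpler than whatever remotemanager does internally: it keeps every value as a raw string and ignores quoting and escaping.

```python
def parse_placeholder(body: str) -> tuple[str, dict[str, str]]:
    """Split 'NAME:key=value:key=value' into a name and a dict of raw strings."""
    name, *pairs = body.split(":")
    kwargs = {}
    for pair in pairs:
        key, _, value = pair.partition("=")
        kwargs[key] = value
    return name.lower(), kwargs


print(parse_placeholder("TIME:format=time:default=3600"))
# ('time', {'format': 'time', 'default': '3600'})
```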

Tip

Add keyword args just like you would in a normal Python call, but separated by a : character.

A complete list of kwargs is provided below:

default

This sets the default value of a parameter. If no default is specified and no value is given, then the empty_treatment behaviour applies (described below).

[4]:
template = "a = #a:default=10#"

conn = Computer(template=template)

print(conn.script())
a = 10

value

You can also set the value directly. This is roughly equivalent to setting default, but has a higher priority. For example, if you set #PARAM:default="foo":value="bar"#, then “bar” will take priority, rendering the default essentially useless.
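The priority rule can be summarised in a couple of lines. This is only a sketch of the behaviour described above, using a hypothetical resolve function, and it says nothing about how remotemanager stores these values internally.

```python
def resolve(value=None, default=None):
    """Pick the explicit value when one is set, otherwise fall back to the default."""
    return value if value is not None else default


print(resolve(default="foo"))               # foo
print(resolve(default="foo", value="bar"))  # bar
```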

optional

Set to False to enforce that a value is present.

In the following example, we are allowed to give no value for optional. However, we will get an error if we try to generate a script without specifying required.

[5]:
template = """
optional = #OPT#
required = #REQ:optional=False#
"""

conn = Computer(template=template)

print(conn.script())
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[5], line 8
      1 template = """
      2 optional = #OPT#
      3 required = #REQ:optional=False#
      4 """
      6 conn = Computer(template=template)
----> 8 print(conn.script())

File ~/Work/Devel/remotemanager/remotemanager/script/script.py:319, in Script.script(self, empty_treatment, **run_args)
    317 # check validity
    318 if not self.valid:
--> 319     raise ValueError(f"Missing values for parameters:\n{self.missing}")
    320 # generation section
    321 self._link_subs()  # ensure values are properly linked

ValueError: Missing values for parameters:
['req']

Optional Properties

You can see the required values at any moment by accessing the required property of your Computer.

Alternatively, the missing property shows what you still need to specify.

Finally, valid will be True only if there are no missing parameters.

[6]:
print("required:", conn.required)
print("missing:", conn.missing)
print("valid:", conn.valid)
required: ['req']
missing: ['req']
valid: False

requires

If a parameter requires another, you can specify that with requires. In the following template, the optional=True on foo is ignored, since param declares that it requires foo.

Added in version 0.13.3: You can now specify multiple requirements with a comma separated list: requires=a,b,c.

[7]:
template = """
param = #PARAM:default={foo}:requires=foo#

foo = #FOO:optional=True#
"""

conn = Computer(template=template)

print(conn.script())
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[7], line 9
      1 template = """
      2 param = #PARAM:default={foo}:requires=foo#
      3
      4 foo = #FOO:optional=True#
      5 """
      7 conn = Computer(template=template)
----> 9 print(conn.script())

File ~/Work/Devel/remotemanager/remotemanager/script/script.py:319, in Script.script(self, empty_treatment, **run_args)
    317 # check validity
    318 if not self.valid:
--> 319     raise ValueError(f"Missing values for parameters:\n{self.missing}")
    320 # generation section
    321 self._link_subs()  # ensure values are properly linked

ValueError: Missing values for parameters:
['foo']

replaces

You can also mark a parameter as replacing another. This is less useful, but if you find a use case, it’s there.

Here, you can specify only param, and foo will not complain.

Added in version 0.13.3: You can now specify multiple replacements with a comma-separated list: replaces=a,b,c.

[8]:
template = """
param = #PARAM:replaces=foo#

foo = #FOO:optional=False#
"""

conn = Computer(template=template)

print(conn.script(param="param"))

param = param


hidden

Hidden does what it says on the tin, allowing you to hide values but still use them elsewhere.

This is useful if you have configuration values or constants that are to be used in other parameters, but you don’t want them printed in the script.

Warning

Hidden parameters will always use the line method of removal. They can (and will) clobber data that shares a line with them, so make sure that you space them out.

[9]:
template = """
val = #VAL:default={const*2}#

#CONST:value=10:hidden=True#
"""

print(Computer(template=template).script())

val = 20


min/max

Setting min or max will raise an exception if the value falls outside these bounds (even if the value is calculated).

Here, b returns 10x the value of a, so calling with a=3 will result in a value of b=30.

As this is above our set limit, we will get an exception. The same applies to min, but in reverse.

[10]:
template = """
a = #a#
b = #b:max=20:default={a*10}#
"""

conn = Computer(template=template)

print(conn.script(a=3))
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[10], line 8
      1 template = """
      2 a = #a#
      3 b = #b:max=20:default={a*10}#
      4 """
      6 conn = Computer(template=template)
----> 8 print(conn.script(a=3))

File ~/Work/Devel/remotemanager/remotemanager/script/script.py:327, in Script.script(self, empty_treatment, **run_args)
    325     value = DELETION_FLAG_LINE
    326 else:
--> 327     value = sub.value  # get the value in string form
    328 if value is None or value == "None":
    329     # no value, triage this argument
    330     treatment = empty_treatment or sub.empty_treatment

File ~/Work/Devel/remotemanager/remotemanager/connection/computers/substitution.py:172, in Substitution.value(self)
    166 @property
    167 def value(self) -> any:
    168     """
    169     Returns:
    170         value if present, else default
    171     """
--> 172     val = super().value
    174     if len(self.dependencies) == 0:
    175         return val

File ~/Work/Devel/remotemanager/remotemanager/connection/computers/dynamicvalue.py:356, in DynamicMixin.value(self)
    352     pass
    354 val = self._format_value(val)
--> 356 self._validate(val)
    358 return val

File ~/Work/Devel/remotemanager/remotemanager/connection/computers/dynamicvalue.py:441, in DynamicMixin._validate(self, value)
    437     raise ValueError(
    438         f"{value}{nameinsert} is less than minimum value {self.min}"
    439     )
    440 if self.max is not None and value > self.max:
--> 441     raise ValueError(
    442         f"{value}{nameinsert} is more than maximum value {self.max}"
    443     )

ValueError: 30 for b is more than maximum value 20
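The bounds check visible in the traceback can be reduced to a small standalone function. The sketch below is modelled on the error messages shown above, but the validate function and its signature are assumptions, not remotemanager's API.

```python
def validate(value, name, minimum=None, maximum=None):
    """Raise ValueError when value falls outside the optional [minimum, maximum] range."""
    if minimum is not None and value < minimum:
        raise ValueError(f"{value} for {name} is less than minimum value {minimum}")
    if maximum is not None and value > maximum:
        raise ValueError(f"{value} for {name} is more than maximum value {maximum}")
    return value


a = 3
try:
    validate(a * 10, "b", maximum=20)  # b is calculated as 10x the value of a
except ValueError as error:
    print(error)  # 30 for b is more than maximum value 20
```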

format

Format allows you to enforce the format of the variable.

The possible values are:

  • time

  • float

time

This will enforce an HH:MM:SS format for the input. By default it expects integer seconds, but it also accepts a time string or “semantic time”. For example:

"01:00:00" == "1h" == 3600 -> 01:00:00

"24:00:00" == "1d" == 86400 -> 24:00:00

Added in version 0.11.15.

Enabled the ability to set “semantic time” with 4h, 1d, etc.
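For readers who want to see the conversion spelled out, here is a minimal sketch that reproduces the equivalences listed above. The to_hms helper and the set of unit suffixes are assumptions for illustration; the units that remotemanager actually accepts may differ.

```python
import re

# Assumed unit suffixes; the units remotemanager actually accepts may differ.
UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400}


def to_hms(value) -> str:
    """Format integer seconds, 'HH:MM:SS' or a semantic time like '4h' as HH:MM:SS."""
    if isinstance(value, str) and re.fullmatch(r"\d+:\d{2}:\d{2}", value):
        hours, minutes, seconds = (int(part) for part in value.split(":"))
        total = hours * 3600 + minutes * 60 + seconds
    elif isinstance(value, str) and re.fullmatch(r"\d+[smhd]", value):
        total = int(value[:-1]) * UNIT_SECONDS[value[-1]]
    else:
        total = int(value)
    hours, remainder = divmod(total, 3600)
    minutes, seconds = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}"


for example in ("01:00:00", "1h", 3600, "1d", 86400):
    print(example, "->", to_hms(example))
```

The first three inputs all give 01:00:00 and the last two give 24:00:00, matching the examples above.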

float

Setting format=float forces the value to be treated as a float.

By default, all numerical inputs are passed through math.ceil and converted to int.

This prevents two issues:

  • Integers are generally preferred in jobscripts; requesting resources with floats may lead to unintended behaviour

  • Small fractions do not resolve to 0, so at least 1 of the resource in question is always requested

Let’s say we have a calculation that requests nodes based on the total number of tasks and the number of cores that each node has. In this situation we could accidentally request 0 nodes if we ask for fewer tasks than a full node provides:

NODES = TASKS / CORES_AVAILABLE

For 64 tasks on 128-core nodes, this resolves to 64 / 128 = 0.5.

So instead of requesting nodes=0.5 (or nodes=0), we convert this to nodes=1.
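The arithmetic from the example above can be checked directly with the standard library. The variable names here are only illustrative.

```python
import math

tasks = 64
cores_available = 128  # cores per node, as in the example above

raw_nodes = tasks / cores_available  # 0.5
nodes = math.ceil(raw_nodes)         # rounds up to 1, never down to 0

print(raw_nodes, nodes)  # 0.5 1
```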

However, if you want a float, you can set format=float, which will enforce that the value is printed as a float.

[11]:
template = """
time = #walltime:format=time#

num_flt = #num_flt:format=float#
num_int = #num_int#
"""

conn = Computer(template=template)

conn.walltime = "24h"
conn.num_flt = 3.0
conn.num_int = 3.0

print(conn.script())

time = 24:00:00

num_flt = 3.0
num_int = 3

static

Sometimes you need literal {} in your output, but by default this causes the contents to be evaluated. In that case, you can force the value to be “static” by passing static=True.

[12]:
template = """
a = #a:default=10#

b = #b:default={a}#
c = #c:default={a}:static=True#
"""
print(Computer(template=template).script())

a = 10

b = 10
c = {a}

empty_treatment

This argument dictates how empty values are treated. It can be applied individually to the parameters, or globally to the Computer.

Since the default behaviour is to remove lines with empty values, you do not need to worry about over-parameterising a jobscript. Provided you keep in mind the behaviours described below, a template can contain several times more content than any individual jobscript it produces.

The possible values are:

  • line

  • ignore

  • local

We can demonstrate this with a parameter #NODES#, and see what happens if we give it no value.

line

This is the default behaviour, and removes the whole line.

[13]:
template = "#SBATCH nodes = #NODES:empty_treatment=line#"
print(Computer(template=template).script())

local

Local removes the actual arg itself, leaving the rest of the line.

Useful for when you have multiple parameters in the same line.

[14]:
template = "#SBATCH nodes = #NODES:empty_treatment=local#"
print(Computer(template=template).script())
#SBATCH nodes =

ignore

This behaviour intentionally does nothing, leaving everything untouched:

[15]:
template = "#SBATCH nodes = #NODES:empty_treatment=ignore#"
print(Computer(template=template).script())
#SBATCH nodes = #NODES:empty_treatment=ignore#

Here, we can see all the behaviours in one script:

[16]:
template = """
line = #line:empty_treatment=line#
local = #local:empty_treatment=local#
ignore = #ignore:empty_treatment=ignore#
"""

print(Computer(template=template).script())

local =
ignore = #ignore:empty_treatment=ignore#
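To show the logic behind the three treatments, here is a small standalone renderer. It is a sketch under simplifying assumptions: the render function and its regular expression are hypothetical, the treatment is applied to the whole template rather than per parameter, and kwargs inside the placeholder are matched but not interpreted.

```python
import re

PLACEHOLDER = re.compile(r"#([A-Za-z_]\w*)(?::[^#\n]*)?#")


def render(template: str, values: dict, empty_treatment: str = "line") -> str:
    """Fill placeholders; handle missing values by 'line', 'local' or 'ignore'."""
    output = []
    for line in template.splitlines():
        drop_line = False

        def substitute(match):
            nonlocal drop_line
            name = match.group(1).lower()
            if name in values:
                return str(values[name])
            if empty_treatment == "line":
                drop_line = True   # discard the whole line afterwards
                return ""
            if empty_treatment == "local":
                return ""          # remove only the placeholder itself
            return match.group(0)  # 'ignore': leave the placeholder untouched

        rendered = PLACEHOLDER.sub(substitute, line)
        if not drop_line:
            output.append(rendered)
    return "\n".join(output)


template = "#SBATCH nodes = #NODES#"
for treatment in ("line", "local", "ignore"):
    print(repr(render(template, {}, treatment)))
# ''
# '#SBATCH nodes = '
# '#SBATCH nodes = #NODES#'
```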

Templating

Let’s apply the initial script to a Computer and put it to use.

Note

Again, we need to set the submitter parameter.

[17]:
from remotemanager import Dataset

conn = Computer(template=jobscript_template, submitter="sbatch")

Arguments

The arguments property lists every parameter that the connection expects.

This is similar to the lists you get from required and missing. It is also available under the aliases args and subs.

[18]:
conn.arguments
[18]:
['tasks_per_node',
 'cpus_per_task',
 'nodes',
 'queue',
 'account',
 'time',
 'modules']

Let’s add this connection to a dataset and set the parameters that it’s expecting.

Note

Parameters can be set at the URL (Computer), Dataset, Runner, or run() level.

[19]:
modules = """
module load python
module load module/version
"""

def f(inp):
    return inp

ds = Dataset(f,
             url=conn,
             account="myuser",
             modules=modules,
             skip=False)

ds.append_run({"inp": True}, mpi_per_node=64, omp=4, nodes=4)

ds.run(dry_run=True)
appended run runner-0
Running Dataset
assessing run for runner dataset-9ebf1589-runner-0... running
launch command: cd temp_runner_remote && bash dataset-9ebf1589-master.sh
[20]:
print(ds.runners[0].jobscript.content)
#!/bin/bash

#SBATCH --nodes=4
#SBATCH --account=myuser
#SBATCH --walltime=01:00:00
#SBATCH --exclusive


module load python
module load module/version

export DIR_7f3744ae=7f3744ae_master
source $DIR_7f3744ae/dataset-9ebf1589-repo.sh
python dataset-9ebf1589-runner-0-run.py 2>> dataset-9ebf1589-runner-0-error.out

And there we have a sensible jobscript.

run_args

It was noted earlier, but it’s worth covering in more detail. The actual arguments that are required are called “run args”, usually accessible at the run_args property where relevant.

You can set these at multiple levels, with each successive one overriding the previous.

Set location    Note

Computer        Top-level setting. Global, but overridden by everything else.

Dataset         Global-level defaults. Overrides any URL (Computer) level params.

Runner          Specific to that runner. Overrides any Dataset level params.

run()           Global, but specific to that run. Overrides all other parameters.

For more information on the specifics of run args and Datasets, see the Run Args Tutorial.

We can demonstrate this behaviour with a simple script:

[21]:
mini = Computer(template = "#TEST#")

mini.test = "conn level"

ds = Dataset(f, url=mini, skip=False, verbose=0)
ds.append_run({"inp": True})

ds.run(dry_run=True)

print(ds.runners[0].jobscript.content)
conn level
export DIR_3d536b34=3d536b34_master
source $DIR_3d536b34/dataset-9ebf1589-repo.sh
python dataset-9ebf1589-runner-0-run.py 2>> dataset-9ebf1589-runner-0-error.out

[22]:
ds.set_run_arg("test", "Dataset level")

ds.run(dry_run=True)

print(ds.runners[0].jobscript.content)
Dataset level
export DIR_3d536b34=3d536b34_master
source $DIR_3d536b34/dataset-9ebf1589-repo.sh
python dataset-9ebf1589-runner-0-run.py 2>> dataset-9ebf1589-runner-0-error.out

Here we see that the “Dataset level” setting takes priority.

Now let’s see what happens with multiple runners:

[23]:
ds.append_run({"inp": False}, test = "Runner level")

ds.run(dry_run=True)

print(ds.runners[0].jobscript.content)
Dataset level
export DIR_3d536b34=3d536b34_master
source $DIR_3d536b34/dataset-9ebf1589-repo.sh
python dataset-9ebf1589-runner-0-run.py 2>> dataset-9ebf1589-runner-0-error.out

[24]:
print(ds.runners[1].jobscript.content)
Runner level
export DIR_a61456f2=a61456f2_master
source $DIR_a61456f2/dataset-9ebf1589-repo.sh
python dataset-9ebf1589-runner-1-run.py 2>> dataset-9ebf1589-runner-1-error.out

Appending a second runner with its own value for test changes it for that Runner only; runner-0 keeps the Dataset level value.

However, setting it at the run() level will override even this:

[25]:
ds.run(dry_run=True, test="run() level")

print(ds.runners[1].jobscript.content)
run() level
export DIR_a61456f2=a61456f2_master
source $DIR_a61456f2/dataset-9ebf1589-repo.sh
python dataset-9ebf1589-runner-1-run.py 2>> dataset-9ebf1589-runner-1-error.out
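The override order demonstrated above behaves like successive dictionary updates, from the least specific level to the most specific. The sketch below mimics that with a hypothetical merge_run_args function; it illustrates the priority rule only and is not how remotemanager stores run args.

```python
def merge_run_args(computer=None, dataset=None, runner=None, run=None) -> dict:
    """Merge run args so that each later level overrides the earlier ones."""
    merged = {}
    for level in (computer, dataset, runner, run):
        merged.update(level or {})
    return merged


print(merge_run_args(computer={"test": "conn level"}))
# {'test': 'conn level'}

print(merge_run_args(computer={"test": "conn level"},
                     dataset={"test": "Dataset level"},
                     runner={"test": "Runner level"}))
# {'test': 'Runner level'}
```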

Duplicate Arguments

The template extraction acts on the first occurrence of an argument within a script, so any kwargs should be added there.

Added in version 0.11.15: BaseComputer will raise an exception if kwargs are detected in any occurrence of an argument other than the first.

[26]:
template = """
#test:default=True#

#test:default="foo"#
"""

temp = Computer(template=template)

print(temp.test.value)  # note how the value is "True", not "foo"
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[26], line 7
      1 template = """
      2 #test:default=True#
      3
      4 #test:default="foo"#
      5 """
----> 7 temp = Computer(template=template)
      9 print(temp.test.value)  # note how the value is "True", not "foo"

File ~/Work/Devel/remotemanager/remotemanager/connection/computers/computer.py:38, in Computer.__init__(self, template, **kwargs)
     34 def __init__(self, template, **kwargs):
     35     # super() behaves strangely with multiple inheritance
     36     # explicitly call the __init__ with self
     37     URL.__init__(self, **kwargs)
---> 38     Script.__init__(self, template=template, **kwargs)

File ~/Work/Devel/remotemanager/remotemanager/script/script.py:80, in Script.__init__(self, template, empty_treatment, **init_args)
     77 self._empty_treatment = None
     78 self.empty_treatment = empty_treatment
---> 80 self._extract_subs()

File ~/Work/Devel/remotemanager/remotemanager/script/script.py:148, in Script._extract_subs(self)
    146 if name in self._subs:
    147     if kwargs is not None and kwargs != "":
--> 148         raise ValueError(
    149             f"Got more kwargs for already registered argument "
    150             f"{name}: {kwargs}"
    151         )
    152     logger.debug("\talready processed, continuing")
    153     continue

ValueError: Got more kwargs for already registered argument test: default="foo"

Escape Sequences

Added in version 0.13.5.

Templates reserve a small selection of characters as “control characters”. An example of this is splitting arguments using the : delimiter.

But what if we need to add one of these characters to our template? There are two ways to indicate that control sequences should be added directly.

Quotation

The simplest method is to quote the value. Strings inside quotation marks will not be parsed.

[27]:
template = """
ratio: #ratio:default="a:b"#
equal: #equal:default="a=b"#
"""

test = Computer(template=template)
print(test.script())

ratio: "a:b"
equal: "a=b"

Escaping with \

The backslash (\) character is the standard escape character, and this functionality has been extended to templates.

There are two ways of adding escape sequences to templates:

  • Specify the template as a “raw” string: r"foo\:bar"

  • “Double-escape” the sequence: "foo\\:bar"

Let’s repeat the previous example, without quotes:

[28]:
template = r"""
ratio: #ratio:default=a\:b#
equal: #equal:default=a\=b#
"""

test = Computer(template=template)
print(test.script())

ratio: a:b
equal: a=b

[29]:
template = """
ratio: #ratio:default=a\\:b#
equal: #equal:default=a\\=b#
"""

test = Computer(template=template)
print(test.script())

ratio: a:b
equal: a=b
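Both forms produce the same Python string, which is why the two cells above give identical output. The sketch below checks that, and shows one way a parser could split on a delimiter while respecting a preceding backslash. The split_unescaped helper is hypothetical and does not handle an escaped backslash directly before a delimiter.

```python
import re

# A raw string and a double-escaped string hold the same characters.
assert r"a\:b" == "a\\:b"


def split_unescaped(text: str, delimiter: str = ":") -> list[str]:
    """Split on the delimiter unless a backslash precedes it, then drop the backslashes."""
    parts = re.split(rf"(?<!\\){re.escape(delimiter)}", text)
    return [part.replace("\\" + delimiter, delimiter) for part in parts]


print(split_unescaped(r"ratio:default=a\:b"))
# ['ratio', 'default=a:b']
```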

Further control

We have seen hints in this tutorial that values can be linked to one another. The next tutorial will cover this in greater detail, allowing you to get the full potential out of templating.